Gavin Leeper
For independent bands without booking agents, booking tours is a challenging and lengthy process. Even booking one show can involve a saga of researching venues, trying to get contact info for bookers, reaching out, following up, and hopefully hearing back from one of them. From the other end, many venues are hesitant to take a gamble on out-of-towners when they could book dependable local acts instead, and that risk results in lost revenue on all sides. To manage the risk, venues often prefer to book known local acts paired with out-of-towners, but this is a tough scheduling issue that the bands have to solve on their own. In view of how important both scheduling and (more importantly) band-to-band fit are to the process of booking shows, I decided to build a system that would recommend bands to each other so that they could play together and create a wonderful experience for each other, the venues they play at, and most importantly their fans. Here’s how I built this system using album purchase data scraped from Bandcamp.
These two bands should play together. Skip to my Process Overview and Results section to find out why.
In the first week of April, I got to go on my first tour with my Oakland-based band, Paper Void. It was a humble operation compared to a professional tour: two nights in Los Angeles, one in San Diego, one in Portland, and a last show at Bottom of The Hill back in San Francisco. We were able to crash with our families the entire time and roll down highway 5 in a bus that our friends happened to own. We all have day jobs back here, so we didn't need to keep to a super tight budget. We did alright for our first tour - we didn’t even lose that much money!
Our trusty tour vessel, The Squidship. Painted by Lauren YS. It broke down right when we got back to Oakland. RIP.
Real tours in the industry, I am told, are much more challenging than ours for everyone but the most popular acts. I’ve heard war stories from friends about hitting 42 cities in 45 days with 7 people and their gear crammed into one van, getting one motel room for the entire band each night, living on $10 per diems that had them eating potato chips for lunch, or getting all their gear stolen. To quote my friend Kyle, "You come back from tour completely drained and demoralized, and yet somehow excited to do the whole thing again the next day." From this little taste of the "road life" and from booking our mini-tour, I noticed how important networking is:
When independent bands are trying to get booked at venues, especially away from home, it helps to have friends who are a local band and know the scene. It helps even more if those friends mesh with you musically, and you can approach the venue with a cohesive ticket for the night you are asking about.
Take our show in Portland, for example. Instead of diving into the lengthy research process described above, one of our singers, Hannah, who is from Oregon, reached out to a childhood friend in a local band called Space Shark, and we were able to book our gig at Kelly's Olympian faster than you could say polymetre.
By contrast, even though we reached out through friends in a local LA band, we still had to go through the email saga (partially since there are so many bands in LA and getting gigs is very competitive), and we ended up having to plan a house party with them instead. That party was really great too, but it likely resulted in less revenue and less exposure than a venue show would have. The economics of it all need to add up a bit better when you are planning a real professional tour. As you can imagine, the sheer difficulty of planning a good tour is why there are professionals who often manage bands' tours for them, and their services can really be worth it even at smaller scales.
When they don't have friends in the area where they want to play, how should bands figure out who to play with or where to play? Or perhaps an even better question: how should smaller bands figure out which cities to play in the first place? If there are no bands like you in the area in terms of sound and size of following, and no venues that tend to host music like yours, does it make sense to skip a city entirely? No one wants to drive out to a faraway college town just to play for 5 people and get little exposure or money for their efforts.
Paper Void at The Mint in Los Angeles. Photo credit: Karen Rosalie
It is with this in mind that I decided to work on what I call the Bandcamp Tour Planner. My idea is to leverage data scraped from Bandcamp to recommend bands to each other that should play together. I ultimately want to build this out for multiple cities. As a first step toward a tour planning tool, however, I decided to look at the smaller question of "What local Bay Area bands should play with each other next?"
There are a few ways to attack this question. I decided to look at the number of overlapping fans, i.e. fans who bought one or more albums from both Artist A and Artist B, as a way to measure those artists' similarity to each other.
I scraped data from album pages for some artists I know of in the Bay Area and then from some of the best-selling albums on Bandcamp with the tags San Francisco, Oakland, or Bay Area, resulting in a dataset of 362 distinct bands and about 24,000 users who bought their albums. I rearranged this data into a matrix with all the artists' names along the rows as well as the columns. Here, a given entry represents how many fans the pair of artists at that row and column share, and the number of fans an artist shares with themselves is simply their total number of fans. As you can imagine, the matrix is sparse: about 95% of its values are zero.
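To make the matrix layout concrete, here is a minimal sketch of how a shared-fan matrix like this could be assembled, assuming the scraped data has already been reduced to (fan, artist) purchase pairs. The function name, the input format, and the sample rows are illustrative assumptions, not the actual code or data behind this post.

```python
from collections import defaultdict
from itertools import combinations


def shared_fan_counts(purchases):
    """Build a symmetric artist-by-artist matrix of shared fans.

    purchases: iterable of (fan, artist) pairs (a hypothetical input format).
    Returns (artists, matrix), where matrix[i][j] is the number of fans who
    bought from both artists[i] and artists[j].
    """
    # Collect the set of distinct fans for each artist.
    fans_by_artist = defaultdict(set)
    for fan, artist in purchases:
        fans_by_artist[artist].add(fan)

    artists = sorted(fans_by_artist)
    index = {artist: i for i, artist in enumerate(artists)}
    matrix = [[0] * len(artists) for _ in artists]

    # Diagonal: an artist "shares" all of their own fans with themselves.
    for artist in artists:
        matrix[index[artist]][index[artist]] = len(fans_by_artist[artist])

    # Off-diagonal: size of the intersection of the two fan sets.
    for a, b in combinations(artists, 2):
        shared = len(fans_by_artist[a] & fans_by_artist[b])
        matrix[index[a]][index[b]] = shared
        matrix[index[b]][index[a]] = shared

    return artists, matrix


# Made-up purchases, purely for illustration.
purchases = [
    ("fan1", "Band A"), ("fan1", "Band B"),
    ("fan2", "Band A"), ("fan2", "Band B"),
    ("fan3", "Band A"), ("fan4", "Band C"),
]
artists, matrix = shared_fan_counts(purchases)
```

In this toy example Band A and Band B share two fans, while Band C shares none with either, so most of its row is zero. With hundreds of bands, the same effect is what makes the real matrix so sparse.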
From there, I used data science methods that I'll describe in more detail in an upcoming technical post to calculate a similarity metric between artists. Here, I processed the data using Factor Analysis and then used a common metric called cosine similarity, which can range from -1 to 1. Each band is perfectly similar to itself, so the diagonal of the matrix is all ones.
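As a rough illustration of the metric itself, cosine similarity between two vectors u and v is their dot product divided by the product of their lengths. The sketch below is written in plain Python and deliberately leaves out the Factor Analysis step, so it is not the full pipeline. The function names and the sample vectors are assumptions made for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        # Convention chosen for this sketch: an all-zero vector is
        # treated as similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarity between every pair of row vectors."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Hypothetical per-band feature vectors (for example, factor scores).
vectors = [
    [1.0, 2.0],
    [2.0, 4.0],   # same direction as the first band
    [-1.0, 0.5],
]
sim = similarity_matrix(vectors)
```

Because the measure only depends on direction, the first two bands here come out as essentially identical even though one vector is twice as long as the other, which is part of why it suits comparing bands with very different numbers of fans.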
Once we have this similarity metric calculated between all of the pairs of bands, we can make recommendations for a given band to play with simply by returning the bands with the highest scores. Even in this relatively small dataset and simple method, I keyed into some relationships that make sense. Let's look at the top 10 similar bands for my friend's band Floral.
Though there are definitely some bands in here that wouldn't be a good fit with Floral at a show (for example Ridgewell and TOPR are more hip-hop beatmaker types), some others that made the top 10 would actually work quite well. From this list, Floral could consider reaching out to Vesper Sails and Feed Me Jack (though FMJ sadly broke up last year) for their next Bay Area show. These bands all fit into that math rock, guitar geek type of aesthetic that I, for one, have been really into lately.
The top 10 similar bands for Tycho also include some noise, but there are some nice connections too.
It's good to see, for example, Christopher Willits come up on this list since he and Tycho are frequent collaborators, have similar sounds, and indeed are both signed to the independent record label Ghostly International.
Two smaller local acts here that Tycho could consider as openers next time he plays in the Bay are Young God and Marc Kate.
Of course, I couldn't do this without being curious about results for my band.
In the data I collected here, there weren't really that many other Jazz/R&B/Nu-Soul bands available to recommend, so not much would have come up if we had just looked for similar tags. In terms of fans that bought both our albums, though, I could see it making sense for us to share a bill with Summer Peaks or Big Tree at some point. I think their sounds aren't terribly similar to ours per se, but they might be complementary in building out a good overall show. I'd be curious to see what results a larger dataset might turn up for us.
We can then visualize these similarity relationships more generally by building a social graph with NetworkX and exporting the result as .gexf for visualization in Gephi. The cosine similarity between two bands serves as the weight of their connection. The color is determined by a statistic called modularity class, which essentially seeks to organize the nodes into different clusters. For this graph plot, I only kept the 10 strongest connections from each node, and the node size is determined by how many connections a node has and how strong those connections are.
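The graph-building step could look roughly like the sketch below, which keeps only the strongest connections from each node and uses NetworkX's `write_gexf` for the export. The helper names, the tiny similarity matrix, and the use of weighted degree as a stand-in for node size are assumptions for illustration. Modularity class is computed inside Gephi and does not appear here.

```python
import networkx as nx


def build_similarity_graph(names, sim, top_k=10):
    """Undirected graph keeping each band's top_k strongest connections."""
    graph = nx.Graph()
    graph.add_nodes_from(names)
    for i, name in enumerate(names):
        # Rank every other band by similarity to this one, strongest first.
        others = sorted(
            (j for j in range(len(names)) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )
        for j in others[:top_k]:
            graph.add_edge(name, names[j], weight=float(sim[i][j]))
    return graph


def export_for_gephi(graph, path):
    """Write the graph as a .gexf file that Gephi can open."""
    nx.write_gexf(graph, path)


# Made-up similarity scores for three hypothetical bands.
names = ["Band A", "Band B", "Band C"]
sim = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
]
graph = build_similarity_graph(names, sim, top_k=1)

# Weighted degree: the sum of edge weights at each node, one possible
# basis for sizing nodes.
strength = dict(graph.degree(weight="weight"))
```

Calling `export_for_gephi(graph, "bands.gexf")` would then produce a file to open in Gephi. Note that an edge survives if either band ranks the other in its top k, so a popular band can end up with more than k connections.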
We can see some of the band clusters that we found above. Floral is, indeed, close to Feed Me Jack, Tycho is close to Christopher Willits, and we're close to Summer Peaks.
If you'd like to zoom in to see more details, you can do so by downloading the pdf here.
In the coming months, I hope to build this tool out such that a band can put their Bandcamp URL in and select a city they want to play in, and it will return the 10 most relevant bands for them to reach out to. I also intend to take similar total size of following into account, either from total Bandcamp followers or from something like Facebook "likes" or Soundcloud followers. An even better tool might simply email the band whenever a similar act is playing in one of the cities they are tracking, à la Bandsintown.
Going by number of overlapping album purchasers is really only a start. This approach addresses the question a bit more robustly than shared tags, for example, since ultimately, we're looking for appeal to similar fan bases. However, if there's too much overlap between fan bases, it might not be optimal to book the bands together. That is, if Artist A and Artist B have fan bases that overlap too strongly, you don't really get many extra people at the show by booking both of them instead of just one. This is really important to venues, whose economic goal is ultimately to sell their whole club out every night of the week. In future iterations, I intend to experiment with different metrics of similarity.
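To make the overlap trade-off concrete, here is one hypothetical way to put numbers on it: a Jaccard overlap score for how much two fan bases coincide, alongside a count of the fans a second band would add to the bill. Neither measure is part of the current tool. The function names and fan sets are invented for the example.

```python
def jaccard_overlap(fans_a, fans_b):
    """Shared fans divided by all distinct fans of either band (0 to 1)."""
    union = fans_a | fans_b
    if not union:
        return 0.0
    return len(fans_a & fans_b) / len(union)


def added_audience(headliner_fans, opener_fans):
    """Number of opener fans who are not already fans of the headliner."""
    return len(opener_fans - headliner_fans)


# Made-up fan sets for two hypothetical bands.
fans_a = {"fan1", "fan2", "fan3", "fan4"}
fans_b = {"fan3", "fan4", "fan5"}

overlap = jaccard_overlap(fans_a, fans_b)   # 2 shared out of 5 distinct
extra = added_audience(fans_a, fans_b)      # only fan5 is new
```

A pairing with some overlap but a healthy number of added fans might be the sweet spot: similar enough that both crowds enjoy the whole night, different enough that the bill draws more people than either band alone.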
Another next step could involve integrating Songkick data about which artists have played at which venues, and recommending to bands the venues that similar bands have played at in the past. On the venue side, if venues could be convinced to share their sales information, this tool could provide a prediction for the number of tickets a given bill of artists on a given night can be expected to sell (since the same bill would likely draw fewer people on a Tuesday night, for example, than on a Friday). As I refine this tool and make it available online, I hope it will help bands book more great shows and successful tours.